Human activities can play a crucial role in the statistical properties of observables in
many complex systems such as social, technological and economic systems. We demonstrate
this by looking into the heavy-tailed distributions of observables in fatal plane
and car accidents. Their origin is examined and can be understood as stochastic processes
that are related to human activities. Simple mathematical models are proposed
to illustrate such processes and compared with empirical results obtained from existing
databanks.

Keywords: Heavy-tailed distributions; human activities; stochastic processes; traffic accidents.

PACS Nos.: 02.50.-r, 89.75.Da, 89.40.-a 



1. Introduction 

Many complex systems exhibit heavy-tailed distributions in the observables that
characterize them. Examples include natural hazards such as earthquakes, landslides
and wildfires, as well as man-made disasters such as wars and global terrorism.
Over the years, researchers have asked whether the heavy-tailed distributions observed
in complex systems imply something interesting or whether there is a simple
explanation for this kind of phenomenon. If heavy-tailed distributions do appear in
some of the observables of a system, how should one proceed with the analysis and
determine whether or not they simply result from some stochastic processes? Recently,
researchers have suggested that one of the mechanisms producing such
phenomena is indeed related to human activities. In these works,
the researchers identified the heavy-tailed distributions of the observables as results
of human activities. In particular, they studied the time intervals between occurrences
in various cases such as the rating of movies, e-mail communication, etc. We notice that
these studies concern the effect of human activities on the temporal behavior
of the above cases. It is therefore natural to ask whether human activities can also
affect observables other than the temporal statistics of these systems and, if so, why
and how they are affected.
In this paper, we investigate the role of human activities in the distribution
of observables in complex systems associated with human dynamics. We do so
by studying the distribution of observables such as fatalities and time
intervals between occurrences in fatal traffic accidents, and we try to understand the
role of human activities in these systems. We will show that the patterns of human
activities affect temporal observables as well as other kinds of observables in these
systems. Understanding the origin of the heavy-tailed distributions of observables in
these systems should give us insight for analyzing and understanding many other systems,
such as those mentioned above. Two kinds of fatal traffic accidents are considered here,
namely, plane and car accidents. Most people would agree that accidents should be
understood as some kind of Poisson stochastic process, and it is hard to imagine
how they could be correlated with each other. On a large scale, a traffic accident
that occurs in New York should have nothing to do with the occurrence of a traffic
accident in Moscow. However, as we will show below, the distributions of some
observables in fatal traffic accidents exhibit phenomena similar to those of
systems whose heavy-tailed distributions are commonly
believed to be governed by some deep underlying principle. One would therefore
be eager to know whether there are also deeper reasons for such phenomena in fatal
traffic accidents.

In recent years, researchers have looked into the subject of traffic accidents and
studied their temporal properties. In the case of plane accidents, the authors
of Ref. 7 found that the time lag between commercial airline disasters and their
occurrence frequency could be well described by time-dependent Poisson events. On
the other hand, the authors of Ref. 8 found that beyond certain timescales the time
dynamics of both plane and car accidents are not Poissonian but instead long-range
correlated. In the study of accidents, the quantities that one can measure include the
fatalities in an accident, the number of accidents or fatalities within a certain period,
and the time interval between consecutive accidents. One can therefore
investigate many of these observables aside from the temporal properties and
see whether they all display heavy-tailed distributions. Here we show the empirical
results for these quantities from databanks that we obtained from the web. Although
both the plane and car accidents that people have studied are classified as traffic
accidents, they are very different in nature. First of all, the plane accidents that people
have analyzed are global events, i.e., the records cover all plane accidents around
the world. For car accidents, however, due to the limitation of empirical data,
people usually study cases within a certain region. One could therefore argue
that the occurrence of air traffic accidents should not be affected as much by factors
such as the human circadian and weekly cycles that affect car accidents. In addition,
the number of passengers that an aircraft can carry varies from a few to a
few hundred, while a car usually carries only a few passengers. Therefore, one would
expect plane and car accidents to behave very differently in their temporal
as well as other properties. We will show below that the results from empirical data
indicate that they are indeed very different. This paper is organized as follows. In
Section 2 we present empirical data of plane and car accidents which exhibit heavy-tailed
distributions in some of the observables studied. We then give an explanation
that can reproduce these results based on human activities and illustrate this with
a simple mathematical model in Section 3. Section 4 is the conclusion.



2. Empirical Data 
Fig. 1. (a) A plot of the frequency distribution vs. the fatalities, $x$, in a plane accident during
the period 1978-2007. The lognormal and power-law fits are also drawn for comparison. (b)
Cumulative counts of plane accidents vs. time interval, $\Delta t$, between consecutive plane accidents
during the same period. The exponential fit gives a time constant, $\tau$, equal to 5.12 days.



Fig. 1(a) shows the frequency distribution of the number of fatalities $x$ involved in
plane accidents from 1978 through 2007. One can observe that the empirical
distribution deviates from a lognormal one and instead exhibits a power-law-like
($\sim x^{-\gamma}$) behavior between $x = 5$ and 200, then drops quickly due to a
finite-size effect (the largest possible capacity of current airplanes). We further divide
the plane accidents during this period into three sub-periods of
10 years each. The power-law-like behavior still holds in each of the three curves, with
exponents $\gamma$ ranging between 1.4 and 1.6. We also note a tendency
for the exponents to become smaller as time goes on. Fig. 1(b) is a semi-log plot
of the frequency distribution of time intervals between consecutive accidents in the
same period. A straight-line fit with a time constant, $\tau$, of about 5 days is obtained.
This means that the cumulative counts of plane accidents decay exponentially with
respect to the time interval between consecutive plane accidents, indicating their
Poissonian nature. We further analyze our data by means of the Allan Factor (AF)
statistics, which have been applied to study the temporal behavior of traffic
accidents. In our analysis, we observe that the AF curve is rather flat up to about five
hundred days, which implies a Poisson-like behavior within such timescales. The
AF fluctuates severely for larger timescales due to the low statistics of our data.
No solid conclusion can therefore be drawn about the temporal behavior for timescales
beyond a few hundred days. One should also keep in mind that in these plots we
have ignored incidents resulting from terrorist acts and other incidents, e.g., suicides
by jumping out of an airplane during a flight, which we do not
classify as accidents. These incidents account for only a small fraction of
all incidents and therefore should not affect the result here. For instance, about 30
incidents related to terrorist attacks were reported in the past ten years (1998-2007)
out of a total of 600+ incidents, which is about 5% of all incidents. The result therefore
does not change much when this small fraction is included or excluded.
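The Allan Factor test mentioned above can be sketched as follows. This is a minimal illustration, not the analysis code used for the paper: it assumes the standard definition AF(W) = <(N[k+1] - N[k])^2> / (2 <N[k]>), where N[k] is the number of events in the k-th consecutive counting window of length W, and it runs on synthetic Poisson event times rather than on the accident databank. The function names and the synthetic record are hypothetical. For a Poisson process the AF stays close to 1 at every window length, which is the "flat" behavior referred to in the text.

```python
import random


def allan_factor(event_times, window):
    """Allan Factor of a point process for counting windows of length `window`.

    AF = <(N[k+1] - N[k])**2> / (2 * <N[k]>), with N[k] the number of events
    in the k-th consecutive window. Event times are assumed non-negative.
    """
    n_windows = int(max(event_times) // window)
    if n_windows < 2:
        raise ValueError("need at least two full windows")
    counts = [0] * n_windows
    for t in event_times:
        k = int(t // window)
        if k < n_windows:  # drop events in the last, incomplete window
            counts[k] += 1
    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events fall inside the full windows")
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2.0 * mean_count)


def poisson_event_times(mean_interval, n_events, rng):
    """Synthetic Poisson process: independent exponential waiting times."""
    t, times = 0.0, []
    for _ in range(n_events):
        t += rng.expovariate(1.0 / mean_interval)
        times.append(t)
    return times


rng = random.Random(0)
# Synthetic record with a mean interval of 5.12 days (illustrative only).
times = poisson_event_times(5.12, 20000, rng)
af_values = {w: allan_factor(times, w) for w in (10.0, 50.0, 100.0)}
for w, af in af_values.items():
    print(f"window = {w:6.1f} days  AF = {af:.3f}")
```

On real data, an AF that grows with the window length would point to clustering or long-range correlation, while strong scatter at large windows simply reflects the small number of windows available.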





Fig. 2. Frequency distribution of fatal car accidents in the US for different days of the week
during the period 1994-2006: (a) Monday-Thursday, (b) Friday-Sunday. The inset is a sum of all
the distributions in (a) and (b).

The data of fatal car accidents, however, display a totally different scenario. We
use the dataset of fatal car accidents available on the website of the Fatality Analysis
Reporting System (FARS). This database contains information about fatal
car accidents within the United States in past years, and we present here the data
from 1994 through 2006. The inset in Fig. 2(b) is the plot
of the frequency distribution of fatal car accidents per day from 1994 through 2006. The
shape is somewhat tilted to the left, which can be understood as the effect of weekly
cycles. If one instead plots the fatal car accidents that occurred on different
days of the week, the distributions are very well described by Gaussian
distributions. Fig. 2(a) shows the plots for Monday through Thursday and Fig. 2(b)
shows the plots for Friday through Sunday. The Gaussian fits for the different days of the
week imply that there should be no correlations among accidents. Thus,
the tilted curve in the inset of Fig. 2(b) reflects the weekly cycle in
US citizens' driving habits (activities) during this period. In other countries,
Taiwan for example, the frequency distribution of car accidents per day
corresponding to the inset of Fig. 2(b) already gives a good Gaussian fit, and the origin
can be easily traced back to the driving habits there.
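The pooling argument above can be illustrated numerically. The sketch below is an assumption-laden toy, not a fit to FARS: the weekday means in `WEEKDAY_MEANS` are hypothetical, and each weekday's daily count is drawn from a Gaussian whose variance equals its mean (a Poisson-like choice). Each weekday sample is then symmetric, while the pooled sample acquires a clear asymmetry simply because four low-mean days are mixed with three higher-mean days.

```python
import math
import random

# HYPOTHETICAL mean daily counts for Monday..Sunday; not values from FARS.
WEEKDAY_MEANS = [90, 90, 90, 90, 110, 130, 115]


def skewness(values):
    """Sample skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def simulate_daily_counts(means, n_weeks, rng):
    """One Gaussian draw per day; variance set equal to the mean (assumption)."""
    per_day = [[] for _ in means]
    for _ in range(n_weeks):
        for day, mu in enumerate(means):
            per_day[day].append(rng.gauss(mu, math.sqrt(mu)))
    return per_day


rng = random.Random(1)
per_day = simulate_daily_counts(WEEKDAY_MEANS, 2000, rng)
weekday_skews = [skewness(sample) for sample in per_day]
pooled_skew = skewness([c for sample in per_day for c in sample])

print("per-weekday skewness:", [round(s, 2) for s in weekday_skews])
print("pooled skewness:     ", round(pooled_skew, 2))
```

A positive pooled skewness means that the bulk of the pooled histogram sits on the low-count side with a longer tail toward high counts, which is one way to read the "tilted" shape of the inset.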

Fig. 3(a) is a plot of the time intervals between consecutive accidents, while
Fig. 3(b) is the average number of fatal car accidents per hour in a 24-hour span from
1994 through 2006. The distribution of time intervals between consecutive accidents
in Fig. 3(a) shows a heavy tail significantly different from an exponential
decay. This result suggests that other factors are involved which cause
the distribution to differ from that given by a Poisson stochastic process.
Fig. 3. (a) A semi-log plot of the frequency distribution of the time interval between consecutive
fatal car accidents in the US during the period 1994-2006. (b) The average occurrence distribution
of fatal car accidents in the US in a 24-hour span during the same period.



3. Model 

We have observed above that in both plane and car accidents there are
heavy-tailed distributions in some observables but exponential decays or Gaussian
distributions in the other observables that we investigated. This is different
from the usual statistical systems, in which all observables exhibit power-law-like
behavior near the critical point. One would therefore wonder whether some
underlying principles govern the behavior of the observables of these systems.
We offer an explanation here and illustrate it with a simple model. We believe
that the heavy-tailed distributions in these systems are due to stochastic processes
that are related to human activities.



3.1. Airplane accidents 

We begin by postulating that every plane in flight has the same probability of being
involved in an accident. We further assume that the number of fatalities
in each accident is a random number, i.e., that it is
uniformly distributed from 1 to $N$, where $N$ is the largest possible capacity of
the plane in the accident. This latter assumption is supported by the empirical data
shown in Fig. 4(a), which gives the distributions of fatalities for different makes of
airplanes. The distributions can reasonably be considered uniform
except for the higher frequency of occurrence near zero, which might be due to the
fact that some of the planes were used as cargo planes or for training, etc.,
so that there were only a few passengers on the plane. With these assumptions, we
try to simulate the distribution of fatalities involved in plane accidents.
Fig. 4. (a) The distribution of fatality rate. The capacities of different makes of airplanes are
all normalized to 1 for comparison. (b) The simulated result and the empirical data during
the period 1978-1987. (c) The empirical cumulative distribution of the capacities of
airplanes with different weighting.



Fig. 4(b) is a plot of the simulated result and the empirical data during the
period 1978-1987. The simulated result gives a power-law behavior with an exponent
of about 1.2. This simulation is carried out using the empirical distribution of
the capacities of airplanes (as shown in Fig. 4(c)) that were built and entered
service before and during the period (1978-87) over which we record the plane accidents.
With our two randomness assumptions, the simulated result
reproduces a power-law distribution similar to that of the empirical data. If one
makes further assumptions, the exponent of the simulated result can be brought
even closer to the empirical value. For example, there are many more
domestic flights than international flights, which in turn means that Boeing 727
and 737 planes fly more often than Boeing 747 planes. From our first assumption,
this implies that a Boeing 727 or 737 is more likely to be involved in an
accident than a Boeing 747. The dashed line in Fig. 4(b) is the simulated result
with the extra assumption that the ratio of domestic to international flights
is 3:1. The exponent now increases to 1.3. Further assumptions can be made,
e.g., by introducing a variable no larger than $N$, the maximum capacity of a plane,
which represents the number of passengers that were actually on the plane. Since
our aim is to demonstrate that a power-law-like distribution can result from simple
randomness assumptions, we will not pursue this further. We should also
comment that as more airplanes were built during the last two decades, the
exponent of the power-law behavior of the simulated result gradually gets smaller,
a trend that is consistent with the empirical data shown in Fig. 1(a). Since we do
not know how the airline companies replace old airplanes with new ones, this is at
best a hint that further supports the randomness assumptions.
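The two randomness assumptions of this section can be sketched in a few lines. Under them, if $p(N)$ is the probability that the plane involved has capacity $N$, the probability of $x$ fatalities is $P(x) = \sum_{N \ge x} p(N)/N$. The code below is only an illustration of that mechanism: the capacity classes and weights are hypothetical stand-ins for the empirical capacity distribution of Fig. 4(c), which is not reproduced in the text, so the slope it prints should not be compared with the exponents quoted above.

```python
import math
import random

# HYPOTHETICAL capacity classes (seats) and relative numbers of flights.
# These are illustrative stand-ins, not the empirical data of Fig. 4(c).
CAPACITIES = [10, 20, 50, 100, 150, 200, 300, 400]
WEIGHTS = [40, 25, 15, 8, 5, 4, 2, 1]


def exact_fatality_pmf(capacities, weights):
    """P(x) = sum over N >= x of p(N) / N; the list index is x (pmf[0] unused)."""
    total = float(sum(weights))
    pmf = [0.0] * (max(capacities) + 1)
    for cap, w in zip(capacities, weights):
        share = (w / total) / cap
        for x in range(1, cap + 1):
            pmf[x] += share
    return pmf


def simulate_fatalities(capacities, weights, n_accidents, rng):
    """Pick a plane by weight, then draw fatalities uniformly from 1..N."""
    out = []
    for _ in range(n_accidents):
        cap = rng.choices(capacities, weights=weights, k=1)[0]
        out.append(rng.randint(1, cap))
    return out


def loglog_slope(pmf, x_min, x_max):
    """Least-squares slope of log P(x) against log x on [x_min, x_max]."""
    pts = [(math.log(x), math.log(pmf[x]))
           for x in range(x_min, x_max + 1) if pmf[x] > 0]
    mx = sum(p[0] for p in pts) / len(pts)
    my = sum(p[1] for p in pts) / len(pts)
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    return sxy / sxx


rng = random.Random(2)
pmf = exact_fatality_pmf(CAPACITIES, WEIGHTS)
sample = simulate_fatalities(CAPACITIES, WEIGHTS, 50000, rng)
slope = loglog_slope(pmf, 5, 200)
frac_small_mc = sum(1 for x in sample if x <= 10) / len(sample)
frac_small_exact = sum(pmf[1:11])

print(f"log-log slope between x = 5 and 200: {slope:.2f}")
print(f"P(x <= 10): Monte Carlo {frac_small_mc:.3f}, exact {frac_small_exact:.3f}")
```

Extra weighting, such as the 3:1 domestic-to-international ratio discussed in the text, would enter this sketch only through the `WEIGHTS` list.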



3.2. Car accidents 

As shown above, the distributions of the observables studied in car accidents offer
a totally different scenario. While the distribution of fatalities indicates that it is
a result of stochastic processes, the time interval plot suggests that there are other
possibilities. We propose here that the heavy-tailed distribution of time intervals is
due to another pattern of human activities, namely, the human circadian rhythm.

Recall that Fig. 3(b) is a plot of the average number of fatal car accidents per
hour in a 24-hour span and that this value reaches a minimum around 5:00 AM.
One can imagine that at this time of the day most Americans
are still in bed, and therefore there are fewer cars on the road than during the rest of
the day. We thus propose that the pattern of the human circadian rhythm, or of driving
habits during a one-day span, can be approximated by the following periodic function

$$ f(t) = \alpha + \beta \sin\left( 2\pi \frac{t}{T} \right) , \tag{1} $$

where $t$ is the time variable, $T$ denotes the period, $\alpha$ is a constant which should be
normalized according to the event rate, and $\beta$ is the amplitude. Although
many periodic functions could match the curve in Fig. 3(b),
Eq. (1) is the simplest one that can guarantee that $f(t)$ is always positive with
a suitable choice of $\alpha$ and $\beta$ (namely $\alpha > |\beta|$). Eq. (1) can be
integrated analytically and then averaged over, in this case, the 24-hour period. One can
then obtain the cumulative distribution of $\Delta t$, denoted $P_>(\Delta t)$ [Eq. (2)],
which is expressed in terms of $\theta = 2\pi t/T$. Fig. 5(a) is a plot of $P_>(\Delta t)$
for different $\alpha$ and $\beta$. Notice that for
certain values of $\alpha$ and $\beta$, Eq. (2) can in fact exhibit a heavy-tailed distribution
which is similar to power-law behavior. The time intervals between consecutive car
accidents in some countries such as Taiwan indeed display such
power-law behavior. Fig. 5(b) is a fit of the empirical data in a semi-log plot using
Eq. (2) with $\alpha = 0.0694$ and $\beta = 0.0386$. The dashed line in the same figure is a
Monte Carlo simulation of Eq. (1) with the same set of $\alpha$ and $\beta$. One can see that
the empirical data in Fig. 3(a) can be well described by the periodic function introduced
in Eq. (1). In addition, we notice that $P_>(\Delta t)$ can develop a bump around the
cut-off when $\beta$ becomes large (which means that there are large fluctuations in
the periodic function). We argue that this bump should be observable in some other
complex systems with longer calm-time intervals.
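The text mentions a Monte Carlo simulation of Eq. (1) without spelling out the algorithm. One possible way to do it is sketched below, under assumptions that are ours and not the paper's: accidents are treated as an inhomogeneous Poisson process with rate $f(t)$, so that the probability of no event in $(t, t + \Delta t)$ is $\exp[-\int_t^{t+\Delta t} f(s)\,ds]$; the process is sampled with the standard thinning method; and $t$ is measured in minutes with $T = 1440$. The last choice is a guess, motivated by the fact that $\alpha = 0.0694$ per minute corresponds to roughly 100 events per day. The sketch then compares the empirical survival function of the inter-event times with that of a constant-rate Poisson process having the same mean rate $\alpha$.

```python
import math
import random

ALPHA = 0.0694    # constant term of Eq. (1), value quoted in the text
BETA = 0.0386     # amplitude of Eq. (1), value quoted in the text
PERIOD = 1440.0   # ASSUMPTION: t in minutes, so one period is one day


def rate(t):
    """Periodic event rate f(t) of Eq. (1)."""
    return ALPHA + BETA * math.sin(2.0 * math.pi * t / PERIOD)


def simulate_event_times(t_end, rng):
    """Inhomogeneous Poisson process on [0, t_end) by thinning."""
    rate_max = ALPHA + abs(BETA)  # upper bound of f(t)
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate_max)  # candidate from a constant-rate process
        if t >= t_end:
            return events
        if rng.random() < rate(t) / rate_max:  # keep with probability f(t)/max
            events.append(t)


def survival(intervals, dt):
    """Empirical P_>(dt): fraction of intervals longer than dt."""
    return sum(1 for x in intervals if x > dt) / len(intervals)


rng = random.Random(3)
n_days = 300
events = simulate_event_times(n_days * PERIOD, rng)
intervals = [b - a for a, b in zip(events, events[1:])]
mean_rate = len(events) / (n_days * PERIOD)

surv_60 = survival(intervals, 60.0)
poisson_60 = math.exp(-ALPHA * 60.0)  # constant-rate reference
print(f"mean rate = {mean_rate:.4f} per minute")
print(f"P_>(60 min): simulated {surv_60:.4f}, constant-rate {poisson_60:.4f}")
```

Long gaps are produced preferentially during the low-rate part of the cycle, which is why the simulated tail lies above the single exponential. Higher harmonics could be added by extending `rate` and adjusting the upper bound used for thinning.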
Fig. 5. (a) The plot of $P_>(\Delta t)$ with different $\alpha$ and $\beta$. (b) Fit of the empirical data using $P_>(\Delta t)$
with $\alpha = 0.0694$ and $\beta = 0.0386$.



4. Conclusion 

In this paper, we have studied the distributions of the temporal properties as well as
other observables in fatal traffic accidents. In the case of plane accidents, the time
lag of each event and their number of fatalities could be well described as Poissonian
and Gaussian stochastic processes. On the other hand, the time lag of car accidents
and their occurrence frequency do exhibit heavy-tailed behavior, which is consistent
with Ref. 8. We further observe that while the distributions of some observables
suggest that the dynamics of these systems are stochastic, the distributions of other
observables instead display heavy-tailed behavior. In our study, the frequency
distribution of the number of fatalities $x$ involved in plane accidents exhibits power-law
behavior, while the frequency distribution of fatal car accidents indicates that
these are indeed Poisson stochastic processes. We propose an explanation based
on stochastic processes that are related to human activities. Human activities can
indeed play a crucial role in the temporal as well as other statistical properties of
many complex systems such as social, technological and economic systems. This
is demonstrated explicitly here with the two examples, in which the time intervals
between occurrences as well as the fatalities have heavy-tailed distributions. The role of
human activities in the temporal properties of complex systems has been pointed
out before. We argue here that human activities can also affect observables other
than the temporal behavior of complex systems. The distribution of the capacities
of airplanes in Fig. 4(c) is a result of the needs of our society, which in turn are
related to human activities. In order to understand their effect on observables in
a more quantitative manner, we also introduce a periodic function to approximate
the pattern of the human circadian rhythm or driving habits in the study of car accidents.
The empirical data on the temporal properties of car accidents can in fact be
well described by this function. Modifications of this function, such as including
higher harmonics, can be made in a simple manner. We believe that the heavy-tailed
behavior of observables in many complex systems can in fact be understood
as resulting from stochastic processes that are related to human activities, and that their
temporal properties can be well described by periodic functions of this type. Work in this
direction is in progress.